Abstract 

When faulty sensors are rare in a network, diagnosing sensors individually is inefficient. This 
study introduces a novel use of concepts from group testing and Kalman filtering in detecting 
these rare faulty sensors with significantly fewer tests. By assigning sensors to groups 
and performing Kalman filter-based fault detection over these groups, we obtain binary detection 
outcomes, which can then be used to recover the fault state of all sensors. We first present this 
method using combinatorial group testing. We then present a novel adaptive group testing method 
based on Bayesian inference. This adaptive method further reduces the number of required tests and 
is suitable for noisy group test systems. Compared to non-group testing methods, our algorithm 
achieves similar detection accuracy with fewer tests and thus lower computational complexity. 
Compared to other adaptive group testing methods, the proposed method achieves higher accuracy 
when test results are noisy. We perform extensive numerical analysis using a set of real vibration 
data collected from the New Carquinez Bridge in California using an 18-sensor network mounted on 
the bridge. We also discuss how the features of the Kalman filter-based group test can be exploited 
in forming groups and further improving the detection accuracy. 


1 Introduction 

Wireless sensor networks (WSNs) have been successfully used in many applications such as structural 
health monitoring [2], environmental monitoring [3] and vehicle tracking [3]. With the increasing use 
of small, low power and low cost sensors, it has also become increasingly critical to ensure the accuracy 
and integrity of the measured data as low cost sensors are error prone while the environment in which 
they are deployed may be harsh. Timely detection of malfunctioning sensors in a system allows the 
operator to correct affected sensor readings and arrange for replacement, both of which can prevent 
further deterioration of the network, and thus should be an essential functionality of a WSN. 

Over the past decade, detection of malfunctioning sensors has been studied extensively in many 
different application contexts. Malfunctioning can be classified into two levels. The first is sensor 
failure, whereby sensors become unresponsive or cease to provide data. The second is 
sensor faulting, whereby sensors continue to report measurements but the data are intermittently or 
permanently corrupted. Sensor fault detection is generally more difficult than sensor failure detection 
because it is typically harder to assess the accuracy of data than it is to determine its absence. In this 
paper our focus is on sensor fault detection. 

A preliminary version of part of this work appeared in the IEEE International Conference on Distributed 
Computing in Sensor Systems (DCOSS) 2013 [1]. 

Sensor fault detection methods can be further classified as model-based and model-free. 
Model-based fault detection methods rely on a model capturing the dynamics of the system being monitored; 
this model can be obtained either from physical properties of the system (e.g., a state-space model) or 
from learning the parameters of a designated model (e.g., a Markov or autoregressive model). Both 
Kobayashi et al. [8] and Da et al. [9] proposed centralized detection algorithms that assume a 
state-space model of the system is available and use a bank of Kalman filters to detect faulty 
sensors. Both methods assume there is at most one faulty sensor at any given time and make use 
of the remaining sensors as references. A more detailed and quantitative comparison is given 
later in the paper. Li et al. [10] proposed an algorithm that requires fault-free sensors to be designated a 
priori as reference sensors, and the number of reference sensors must exceed the 
number of uncertain sensors (i.e., those in unknown fault state). Their algorithm constructs an analytical 
relationship between the output of each uncertain sensor and that of all reference sensors, which is 
then used for detection. Ricquebourg et al. [11] modeled sensor dynamics using a Markov chain under 
a transferable belief framework when the whole system is healthy. Once the model is established, 
any sensor outputs inconsistent with the model are further analyzed using predefined decision rules. 
Lo et al. [12] proposed a decentralized algorithm which is able to identify spike faults in addition 
to detecting general faults. Under this method pairs of sensors cross-validate each other using their 
measurements and an autoregressive with exogenous input (ARX) model trained a priori; this method 
does not require reference sensors or a priori knowledge of the system model. 

Model-free fault detection methods do not require a dynamical model and usually rely on the 
assumption that sensors in close proximity observe similar dynamics. As a result, the density of the 
sensors needs to be high relative to the fluctuation of the signals being monitored. For instance, Ding 
et al. [13] and Chen et al. [14] suggested similar model-free sensor fault detection methods, where 
each sensor's output is compared with its neighbors'. A sensor that deviates significantly from its 
neighbors is identified as faulty. Koushanfar et al. [15] proposed a cross-validation based fault detection 
algorithm that focuses on the impact of a particular sensor's measurement on the consistency of the 
entire network's measurement, under the assumption that an incorrect measurement will degrade the 
consistency. This algorithm removes one sensor at a time and evaluates how much the consistency of 
the system improves. The sensor whose removal improves the system most significantly is regarded 
as faulty and eliminated, and the process is repeated until the system consistency cannot be improved 
any further. 

All of the above-mentioned fault detection methods require a number of tests at least on the 
order of the size of the network, i.e., $O(N)$ tests are required, where $N$ is the number of sensors in 
the network. Some methods even need $O(mN)$ (where $m$ is the number of neighbors of a sensor) or 
$O(N^2)$ tests. A summary of the detection complexity is given in Table 1. For applications using an 
extremely large number of sensors, running a fault detection algorithm can consume a large amount 
of resources and cause significant delay. 

We observe that while certain regional effects or catastrophic failure may result in a large number of 
faulty sensors at the same time, in the absence of such systemic problems and during normal operation 
faults occur randomly and sporadically. This motivates us to seek lower complexity fault detection 
methods when faults may be rare and sparse. 

Toward this end, we introduce a novel use of group testing techniques combined with Kalman 
filtering in detecting faulty sensors in a network. Assuming that the underlying system being monitored 
may be represented in a linear dynamical system framework and that sensor faults are relatively rare, 
our goal is to reduce the number of required tests given requirements on detection and false positive 
probabilities. There have been a few studies on using group testing to detect malfunctioning sensors; 
they generally differ in the testing/detection methods. For instance, Goodrich and Hirschberg [18] 
evaluate a group of sensors by counting the number of responses from the group to a broadcast query 
(thus only applicable to sensor failure detection rather than fault detection), while Tosic et al. [19] 
use an unspecified dissimilarity comparison of neighboring sensors' measurements. Our work differs 
from the former in that we focus on detecting faulty sensors which are still responsive to queries, and 
differs from the latter in that we do not assume that sensors are highly correlated or that neighboring 
sensors have similar measurements. 

Method type                  Complexity      Condition needed 
Model-based: 
  Kobayashi et al. [8]       $O(N)$          At most one faulty sensor 
  Da et al. [9]              $O(N)$          At most one faulty sensor 
  Lo et al. [12]             $O(N)$ 
  Li et al. [10]             $O(N)$          Reference sensor 
  Ricquebourg et al. [11]    $O(N)$ 
Model-free: 
  Ding et al. [13]           $O(mN)$         m = # of neighbors 
  Chen et al. [14]           $O(mN)$         m = # of neighbors 
  Koushanfar et al. [15]     $O(N^2)$ 
  Blough et al. [17]         $O(N \log N)$ 

Table 1: Summary of existing methods 

Our approach consists of the following two components: the selection of a test group (also referred 
to as a test pool), and a Kalman filtering based testing/detection procedure over this group of sensors, 
which determines whether there exists at least one faulty sensor in this group. These two steps 
are repeated until the desired performance criteria have been achieved. There are in general two ways of 
selecting the test groups. The first is open-loop, whereby the entire set of test groups is selected 
prior to performing any tests (this is done randomly in our study); this will be referred to as the 
combinatorial group testing (CGT) method. The second is closed-loop, whereby each test group 
is selected adaptively based on outcomes of previous tests (this adaptive selection is done using the 
standard criterion of uncertainty reduction maximization in our study); this will be referred to as the 
Bayesian group testing (BGT) method. Both methods will be examined in this study. We will further 
consider the detection performance of Kalman filtering, and use such understanding in determining 
the selection of test groups under the Bayesian group testing method; this will be referred to as the 
Kalman filtering-enhanced Bayesian group testing method (KF-BGT). It should be emphasized that 
under all these methods the group tests (the second component) themselves are performed via Kalman 
filtering; they simply differ in how the test groups are selected (the first component). 

Existing adaptive group testing methods generally assume error-free detection, thus an entire 
group of sensors is removed from further consideration when the test result is negative. Examples 
include Hwang's generalized binary splitting algorithm [20], Allemann's split-and-overlap algorithm 
[21] and Du et al.'s competitive GT algorithm [22]. Test errors have been considered in the literature 
of compressive sensing (e.g., see [23, 24]), which is closely related to group testing. However, these 
adaptive methods are not directly applicable to group testing as the latter is given by a Boolean 
operation whereas compressive sensing based test results are given by a linear operation. Our study 
further differs from both because our test results are given by a Kalman filtering based detection 
procedure (neither a Boolean nor a linear operation), which is noisy and whose result depends on the 
design of the test and the detector. This raises a significant challenge that we will address in this paper. 

The remainder of the paper is organized as follows: Section 2 reviews the main concepts used in the 
group testing-based fault detection algorithm. The detailed methodology of the detection algorithm 
based on CGT is explained in Section 3. The Bayesian group testing method BGT is described in 
Section 4. Section 5 describes the experimental setup and the nature of a set of bridge vibration 
data we use for numerical evaluation. The performance of the CGT and BGT methods on the bridge 
vibration data is presented in Sections 6 and 7, respectively. The analysis of the Kalman 
filtering-enhanced version, KF-BGT, is presented in Section 8. Section 9 concludes the paper. 


2 Preliminaries 

In this section we review two main concepts used in our fault detection algorithm. The first is group 
testing, the goal of which is to identify sparse faulty items with fewer tests than the total 
number of items. The second concept is Kalman filtering, which is able to produce optimal state 
estimation for a linear dynamical system. 


2.1 Group Testing 

Consider a large number of items of which a few are defective. If each item is tested individually, 
the cost can be high (linear in the total number of items). However, if it is possible to determine the 
existence of a defective item in a group via a single group test, then performing a sequence of group 
tests over different subsets of these items can potentially lead to far fewer tests and thus 
much lower cost. This is the main idea of group testing; it was first proposed by Dorfman [25] during 
World War II for detecting syphilis amongst soldiers. 

Consider a length-$N$ signal $S$ which is $d$-sparse: this means $S$ has at most $d$ non-zero entries that 
correspond to the defective items and $d \ll N$. As the "true" signal dimension (i.e., $d$) is smaller 
than $N$, it is conceivable that signal $S$ can be acquired with $M < N$ measurements. In the group testing 
paradigm, signal $S$ is measured $M$ times in the form of $W = \Phi S$, where $\Phi$ is the measurement matrix 
of size $M \times N$. The arithmetic is Boolean, meaning that multiplication is logical AND and addition is 
logical OR. If these operations are noisy, then the group test results are given by $Z$ rather than $W$, 
with $P(Z_i = 1 \mid W_i = 0) = \alpha$ and $P(Z_i = 0 \mid W_i = 1) = \beta$, $\forall i$, denoting the two types of errors. The 
goal of group testing is to design $\Phi$ such that $S$ can be reconstructed from $Z$ (i.e., we can find the $d$ 
defective items) with sufficiently low error probabilities. 

We now describe this in the context of a network of $N$ sensors, of which at most $d$ are faulty. Let 
vector $S$ represent the true fault state of the sensors in the network, where $S_i = 0$ if sensor $i$ is normal 
and $S_i = 1$ if sensor $i$ is faulty. The $i$-th row of the 0-1 matrix $\Phi$ represents the set of sensors involved 
in the $i$-th test, and is called a test group/pool denoted by $\Phi_i$; the number of rows equals the number of 
tests. Finally, the vector $Z$ represents the result of the group tests. Below is a toy example of $\Phi$, $S$ and $Z$: 


Example 2.1. 

$$\Phi = \begin{bmatrix} 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 1 & 0 \end{bmatrix}, \qquad S = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad Z = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$



In this example, there are 6 sensors; sensor 2 is faulty. A total of 3 group tests are performed: 
sensors {2, 5, 6} are included in the first test (first row of $\Phi$), and so on. The test result shows correctly 
that the first group contains at least one faulty sensor and the second group has none, but declares 
incorrectly that the third group contains a faulty sensor. In a fault detection setting, $S$ is unknown 
while $\Phi$ is known by design and $Z$ is known by observing the test results. $\Phi$ and $Z$ are then used 
to reconstruct $S$. As mentioned in the introduction, group-testing a set of sensors in our context is 
far more complicated than a simple Boolean operator, noise-free or noisy. To use this group testing 
framework in practice, we must specify what a "group test" entails, and how to actually obtain values 
in the $Z$ vector. This is addressed by a novel use of Kalman filtering detailed next. 
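
The idealized Boolean group test model of this subsection can be sketched in a few lines of Python. This is only an illustration of the noisy Boolean model (not of the Kalman filter based test introduced next); the function names and the use of a seeded random generator are our own assumptions, and the fault state vector is the one implied by Example 2.1 (sensor 2 faulty).

```python
import random

def boolean_group_test(phi, s):
    """Error-free outcomes W = Phi S under Boolean arithmetic (AND for products, OR for sums)."""
    return [int(any(p and x for p, x in zip(row, s))) for row in phi]

def noisy_group_test(phi, s, alpha, beta, rng):
    """Flip each error-free outcome: 0 -> 1 with probability alpha, 1 -> 0 with probability beta."""
    z = []
    for w in boolean_group_test(phi, s):
        flip = rng.random() < (beta if w == 1 else alpha)
        z.append(1 - w if flip else w)
    return z

# Measurement matrix and fault state of Example 2.1 (sensor 2 is faulty).
PHI = [[0, 1, 0, 0, 1, 1],
       [0, 0, 1, 1, 0, 1],
       [1, 0, 0, 1, 1, 0]]
S = [0, 1, 0, 0, 0, 0]

w = boolean_group_test(PHI, S)   # error-free result: only the first pool contains sensor 2
z = noisy_group_test(PHI, S, alpha=0.1, beta=0.1, rng=random.Random(0))
```

The error-free outcome differs from the $Z$ of Example 2.1 only in the third entry, which is the false positive discussed above.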

2.2 Kalman Filter Based Group Test 

The Kalman filter [26] is an algorithm which takes a series of noisy inputs and iteratively calculates a 
statistically optimal estimate of the state of an underlying linear dynamical system. More specifically, 
consider a linear dynamical system given by the following state-space model [26]: 

$$X_{k+1} = A X_k + B U_k + G_k \tag{1}$$

$$Y_k = C X_k + V_k, \tag{2}$$

where the first equation represents the dynamics of the system while the second represents the (sensor) 
observation model. Here $X_k$ is the state vector of the system, $U_k$ the input (or control) 
vector, and $Y_k$ the output vector of sensors. Matrices $A$, $B$ and $C$ are determined by the physics 
of the system as well as the sensors. $G$ and $V$ are Gaussian white noise with zero mean and covariance 
matrices $R_G$ and $R_V$, respectively. $X_0$, $G_k$ and $V_k$ are assumed to be independent. Assuming the 
noises $G$ and $V$ are small, the next system state, $X_{k+1}$, primarily depends on the current system 
state, $X_k$, and the current input $U_k$, while the current output of the sensors, $Y_k$, primarily depends 
on the current system state $X_k$. 

The Kalman filter state estimation can be separated into two steps, a prediction step and an update 
step. In the prediction step, the predicted state $\hat{X}_{k|k-1}$ (of time $k$ based on the value at time $k-1$), 
and the corresponding uncertainty measure of the prediction, $P_{k|k-1}$, are calculated as follows: 

$$\hat{X}_{k|k-1} = A \hat{X}_{k-1|k-1} + B U_k \tag{3}$$

$$P_{k|k-1} = A P_{k-1|k-1} A^T + R_G. \tag{4}$$

Upon observing a measurement $Y_k$, the estimated state and uncertainty measure are updated as 
follows: 

$$K_k = P_{k|k-1} C^T (C P_{k|k-1} C^T + R_V)^{-1} \tag{5}$$

$$\hat{X}_{k|k} = \hat{X}_{k|k-1} + K_k (Y_k - C \hat{X}_{k|k-1}) \tag{6}$$

$$P_{k|k} = (I - K_k C) P_{k|k-1}, \tag{7}$$

where the updated state, $\hat{X}_{k|k}$, is a weighted sum of the estimated state and the innovation 
$(Y_k - C \hat{X}_{k|k-1})$. The weight depends on the uncertainty measure $P_{k|k-1}$: the more uncertain the estimated 
state is, the more weight is placed on the new observation. 
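
To make the two steps concrete, the following minimal sketch specializes the prediction and update equations to a scalar state observed by a single scalar sensor, so that all matrices reduce to numbers. The function and parameter names (`kf_predict`, `kf_update`, `r_g`, `r_v`) are our own and serve only as an illustration.

```python
def kf_predict(x_est, p_est, a, b, u, r_g):
    """Prediction step, Eqs. (3)-(4), for a scalar state."""
    x_pred = a * x_est + b * u
    p_pred = a * p_est * a + r_g
    return x_pred, p_pred

def kf_update(x_pred, p_pred, y, c, r_v):
    """Update step, Eqs. (5)-(7), for a scalar state and a scalar measurement."""
    gain = p_pred * c / (c * p_pred * c + r_v)
    x_est = x_pred + gain * (y - c * x_pred)   # predicted state plus weighted innovation
    p_est = (1.0 - gain * c) * p_pred
    return x_est, p_est, gain

# One filter iteration with a static state (a = 1), no input and a unit-gain sensor.
x_pred, p_pred = kf_predict(x_est=0.0, p_est=1.0, a=1.0, b=0.0, u=0.0, r_g=0.0)
x_est, p_est, gain = kf_update(x_pred, p_pred, y=2.0, c=1.0, r_v=1.0)
```

With equal prediction and measurement uncertainty the gain is 0.5, so the update lands halfway between prediction and measurement; increasing `p_pred` pushes the gain toward 1, which is the weighting behavior described above.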

The group testing method requires the fault detection method to identify whether an arbitrary 
group of sensors contains any faulty member. The idea of using Kalman filtering for group testing lies 
in its ability to estimate the state of the underlying system from the observations of arbitrary sets of 
sensors. For example, if one wants to estimate the system state by using the outputs from sensors 1 
and 3, the observation model (Eq. (2)) can be changed to $Y'_k = C' X_k + V'_k$, where $Y'_k$ ($V'_k$) contains 
only the 1st and 3rd components of $Y_k$ ($V_k$) and $C'$ contains the 1st and 3rd rows of $C$. The dynamic 
equation of the system (Eq. (1)) remains the same. With this in mind, after selecting a test group 
of sensors $\Phi_i$, we can split it into two subgroups $A$ and $B$, and use the observations from each subgroup 
to separately estimate the state of the underlying system (thus it is required that the test group 
contain at least two sensors), and check whether the two are consistent. 

Specifically, denote the estimated states of the system computed from observations of the subgroups 
$A$ and $B$ as $\hat{X}^A_{k|k-1}$ and $\hat{X}^B_{k|k-1}$, respectively. The difference between the two estimated states is given 
by: 

$$e_k = \hat{X}^A_{k|k-1} - \hat{X}^B_{k|k-1}. \tag{8}$$

As all states estimated from the Kalman filter are unbiased (i.e., $E[\hat{X}_{k|k-1}] = X_k$) [26], the expected 
difference $E[e_k] = E[\hat{X}^A_{k|k-1}] - E[\hat{X}^B_{k|k-1}] = 0$ if neither $A$ nor $B$ contains any faulty sensor. Otherwise 
this expectation is non-zero. Therefore, a threshold can be used to decide whether a group of sensors 
contains any faulty sensors: if the difference between the two state estimates is larger than this 
threshold, then $\Phi_i$ is regarded as having at least one faulty sensor and the corresponding entry in $Z$ 
will be set to 1; otherwise the corresponding entry in $Z$ is set to 0. Fig. 1 gives an overview of this 
approach. 

Figure 1: State diagram of the proposed sensor fault detection method. 
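
A minimal sketch of such a group test is given below under strongly simplifying assumptions that are ours and not part of the method description: the state is a scalar random walk ($A = 1$, $B = 0$), every sensor observes it with unit gain and independent noise, sensors within a subgroup are fused by sequential scalar updates, and the difference between the two predictions is aggregated over time by its mean absolute value. The synthetic data generator, the constant-bias fault and the threshold value are purely illustrative.

```python
import random

def subgroup_predictions(measurements, sensors, r_g, r_v):
    """One-step predictions of a scalar random-walk state using only the listed sensors."""
    x_est, p_est, preds = 0.0, 1.0, []
    for y_k in measurements:
        x_pred, p_pred = x_est, p_est + r_g        # prediction step (A = 1, B = 0)
        preds.append(x_pred)
        x_est, p_est = x_pred, p_pred
        for i in sensors:                          # update step, one sensor at a time (C = 1)
            gain = p_est / (p_est + r_v)
            x_est += gain * (y_k[i] - x_est)
            p_est *= (1.0 - gain)
    return preds

def kalman_group_test(measurements, pool, threshold, r_g=0.01, r_v=0.01):
    """Return 1 if the two halves of the pool give inconsistent state estimates, else 0."""
    half = len(pool) // 2                          # the pool must contain at least two sensors
    est_a = subgroup_predictions(measurements, pool[:half], r_g, r_v)
    est_b = subgroup_predictions(measurements, pool[half:], r_g, r_v)
    mean_abs_diff = sum(abs(a - b) for a, b in zip(est_a, est_b)) / len(est_a)
    return int(mean_abs_diff > threshold)

def simulate(n_sensors, n_steps, faulty, bias, rng):
    """Synthetic data: random-walk state, unit-gain sensors, constant bias on faulty sensors."""
    x, data = 0.0, []
    for _ in range(n_steps):
        x += rng.gauss(0.0, 0.1)
        data.append([x + rng.gauss(0.0, 0.1) + (bias if i in faulty else 0.0)
                     for i in range(n_sensors)])
    return data

data = simulate(n_sensors=4, n_steps=200, faulty={2}, bias=5.0, rng=random.Random(1))
z_faulty_pool = kalman_group_test(data, pool=[0, 1, 2, 3], threshold=1.0)
z_clean_pool = kalman_group_test(data, pool=[0, 1], threshold=1.0)
```

In this toy setting the bias is large relative to the noise, so the outcome is insensitive to the threshold; in general the threshold determines the false positive and false negative probabilities of the group test.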

After obtaining the group test result $Z$, the sensor fault state is recovered by a straightforward 
maximum likelihood (ML) decoding. The recovery algorithm evaluates all $\binom{N}{d}$ possible fault states 
and chooses the one under which the group testing result $Z$ is most likely, i.e., choose $v^*$ if 

$$P(Z \mid L_{v^*}) \ge P(Z \mid L_v) \quad \forall v \neq v^*, \tag{9}$$

where $L_v$ denotes any possible fault state and $v \in \{1, 2, \ldots, \binom{N}{d}\}$. However, the probability 
measure in Eq. (9) may be difficult to obtain; in our case it depends on the threshold used in group 
testing. We will thus simply assume that each group test has the same false positive and false negative 
probabilities and use minimum distance decoding. For each possible fault state $L_v$, the recovery 
algorithm calculates the Hamming distance, defined as the number of entries that differ, between the 
predicted output $\Phi L_v$ and the detection outcome $Z$. Fault states with a smaller Hamming distance 
are preferred. Among fault states having the same Hamming distance from $Z$, states with a smaller 
support are preferred as the probability of a sensor being faulty is less than 1/2. If this still results in a tie, 
then the recovery algorithm will choose randomly. 
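
The minimum distance decoding rule can be sketched as follows. Two simplifications in this sketch are our own assumptions: it enumerates all fault states with at most $d$ faulty sensors, and it breaks any remaining tie deterministically (first candidate found) instead of randomly. The function names are illustrative.

```python
from itertools import combinations

def predicted_outcome(phi, support):
    """Error-free Boolean outcome Phi L for the fault state with the given support."""
    return [int(any(row[i] for i in support)) for row in phi]

def min_distance_decode(phi, z, d):
    """Prefer a smaller Hamming distance to z, then a smaller support."""
    n = len(phi[0])
    best_key, best_support = None, ()
    for size in range(d + 1):
        for support in combinations(range(n), size):
            dist = sum(p != r for p, r in zip(predicted_outcome(phi, support), z))
            key = (dist, size)
            if best_key is None or key < best_key:   # first candidate wins remaining ties
                best_key, best_support = key, support
    return [int(i in best_support) for i in range(n)]

# Measurement matrix of Example 2.1.
PHI = [[0, 1, 0, 0, 1, 1],
       [0, 0, 1, 1, 0, 1],
       [1, 0, 0, 1, 1, 0]]

clean = min_distance_decode(PHI, [1, 0, 0], d=1)   # error-free outcome
noisy = min_distance_decode(PHI, [1, 0, 1], d=1)   # outcome Z of Example 2.1
```

With the error-free outcome the sketch recovers sensor 2. With the outcome of Example 2.1, whose third entry is erroneous, it returns sensor 5, because the fifth column of $\Phi$ matches $Z$ exactly; with so few tests a single test error is enough to mislead the decoder.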

3 A Combinatorial Group Testing Based Fault Detection Method 

In this section, a combinatorial group testing (CGT) based fault detection method is presented [1]. 
This section focuses on the design of the test groups or measurement matrix $\Phi$. The group test 
is performed using Kalman filtering as described in Section 2.2. Consider a network of $N$ sensors 
monitoring an underlying physical system that can be modeled as a linear dynamical system. Assume 
any sensor in the network can be faulty and that at most $d$ of them are faulty at any given time. The 
dynamic evolution of the underlying system as well as observations by the sensors can be expressed 
similarly as in (2): 

$$Y_k = C X_k + V_k + E_k, \tag{10}$$

where the additional vector $E_k$ is an unknown error vector induced by sensor faults: its $i$-th component 
is zero if sensor $i$ is not faulty. 

3.1 Group Selection and the Number of Group Tests 

Recall the fault detection problem represented as $Z = \Phi S$, where $S$ represents the fault state of 
sensors ("1" means faulty). As the detection performance largely depends on $\Phi$, our primary task is 
to determine the entries of $\Phi$, i.e., which sensors to include in each test. In this sub-section we focus 
on the non-adaptive CGT method, whereby $\Phi$ is designed prior to the tests. 

A common way of selecting test groups, which we adopt in this study, is to design a disjunct 
measurement matrix. A $d$-disjunct matrix has the property that for any $d+1$ columns, there is always 
a row with entry 1 in a column and zeros in all the other $d$ columns. For instance, the measurement 
matrix in Example 2.1 is 1-disjunct (since any two columns differ in at least one row) but is not 
2-disjunct. The reason a $d$-disjunct matrix is desirable, especially in the case when group tests are 
error-free, is that its output vector $Z$ is distinct for different $d$-sparse vectors $S$ (a vector is $d$-sparse 
if it has at most $d$ non-zero entries), which means that the exact recovery of a $d$-sparse fault state 
vector $S$ is guaranteed with a $d$-disjunct $\Phi$. One simple method to generate a $d$-disjunct measurement 
matrix $\Phi$ with high probability is to generate each entry randomly such that $\Phi_{ij} = 1$ has probability 
1/2. 
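
As a sketch of this random construction, the snippet below draws each entry of the measurement matrix independently and, for small instances, checks by brute force the property used in the argument above, namely that different $d$-sparse fault states yield different error-free outputs. The function names and the brute-force check are our own illustration; the check is exponential in $d$ and intended only for toy sizes.

```python
import random
from itertools import combinations

def random_measurement_matrix(m, n, rng, p=0.5):
    """Draw an m x n 0-1 matrix whose entries equal 1 independently with probability p."""
    return [[int(rng.random() < p) for _ in range(n)] for _ in range(m)]

def outputs_are_distinct(phi, d):
    """True if all fault states with at most d faulty sensors give distinct error-free outcomes."""
    n = len(phi[0])
    seen = set()
    for size in range(d + 1):
        for support in combinations(range(n), size):
            out = tuple(int(any(row[i] for i in support)) for row in phi)
            if out in seen:
                return False
            seen.add(out)
    return True

# Measurement matrix of Example 2.1.
EXAMPLE = [[0, 1, 0, 0, 1, 1],
           [0, 0, 1, 1, 0, 1],
           [1, 0, 0, 1, 1, 0]]

phi = random_measurement_matrix(m=12, n=6, rng=random.Random(0))
ok_d1 = outputs_are_distinct(EXAMPLE, d=1)
ok_d2 = outputs_are_distinct(EXAMPLE, d=2)
```

For the matrix of Example 2.1 the check succeeds for $d = 1$ and fails for $d = 2$: with three tests there are at most eight distinct outcomes, fewer than the number of fault states with up to two faulty sensors.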

The quality of a measurement matrix is reflected in the number of tests needed (the number of 
rows in $\Phi$) to gain enough information in order to correctly recover the fault state $S$. If the group 
tests are error-free and the faulty sensors are distributed uniformly at random, then the necessary 
and sufficient numbers of rows in $\Phi$ are $\Omega(d \log(N/d))$ and $O(d \log N)$, respectively [27]. Under the 
worst-case distribution of faults (i.e., adversarial fault model), the necessary and sufficient numbers of 
rows in $\Phi$ are $\Omega\left(\frac{d^2 \log N}{\log d}\right)$ and $O(d^2 \log N)$, respectively [27]. 

The group tests in our problem are not error-free since detection using Kalman filtering is inherently 
noisy. Noisy group testing problems are less studied than their noise-free counterparts. A 
recent study [28] evaluated the number of tests required for two noisy group testing scenarios: 1) 
Additive model, where a negative group test result may turn positive with a certain probability; and 
2) Dilution model, where a faulty sensor may act normal (diluted) with a certain probability in a group 
test. The sufficient numbers of tests for the additive model and dilution model, under the worst-case 
distribution of faults, are shown to be $O(d^2 \log(N)/(1 - \alpha))$ and $O(d^2 \log(N)/(1 - \beta)^2)$, respectively. 
However, for group tests that can have both false alarms and missed detections, as in our case, the required 
number of tests remains an open question. 

3.2 Practical Implementation 

The method outlined above can be implemented in two ways. The first is as post-processing of data 
already collected at a cluster head or central location. The second is in the form of a real-time sequential 
process, where a control center solicits input from a single group of sensors at a time. A single group 
test is then performed over this group of input. This is followed by soliciting input from the next 
group, and so on. Note that as long as the fault state of the underlying system remains unchanged, 
the fault state estimate can be done over different segments of observations over time. In other words, 
the data provided by each group need not be synchronized and can be generated on demand. 

4 Bayesian Group Testing 

We next present a novel adaptive group testing method based on Bayesian inference. The combinatorial 
group testing method presented in the previous section designs the entire set of tests (i.e., the entire 
$\Phi$) before carrying out any group test. The result of each group test, however, may provide valuable 
information on the sensor state. For instance, in the extreme case when group tests are error-free, 
a negative result implies that all items in that test are normal; thus no further test is required for 
these items. By taking previous test results into account (i.e., adapting to the group test results), the 
sensor state may be identified with fewer tests than under the combinatorial group testing 
method. This idea was adopted in several studies [20-22]. 

Our method maintains a probability measure on the sensor fault state vector, which is updated 
following each group test using Bayesian inference. The updated state estimate is then used to determine 
the next test pool. This process is repeated until the change in the state estimates is sufficiently 
small. As we shall see, compared to existing adaptive group testing methods, our algorithm is designed 
specifically for noisy group tests so that errors do not propagate. 

In the following presentation, subscript $k$ is used to denote the $k$-th component (row) of a vector 
(matrix) and superscript $k$ to denote the collection of a variable from time 1 to $k$. Specifically, denote 
by $\Phi^k = \{\Phi_1, \Phi_2, \ldots, \Phi_k\}$ the set of tests used up to time $k$, where $\Phi_k$ is the $k$-th row vector of $\Phi$, 
and $Z^k = \{Z_1, Z_2, \ldots, Z_k\}$ the set of test results up to time $k$. Let $\mathcal{S}$ be the collection of all possible 
sensor fault states ($\mathcal{S} = \{S = (S_1, S_2, \ldots, S_N) : S_i \in \{0, 1\}\}$). We define two probability measures. The 
first is $P_{S,k} = P(S \mid \Phi^k, Z^k)$, the probability of the sensor state being $S \in \mathcal{S}$ after the $k$-th test; the 
second is $P_{i,k}$, the probability of sensor $i$ being normal after the $k$-th group test. By definition, we have 

$$P_{i,k} = \sum_{S \in \mathcal{S} : S_i = 0} P_{S,k}.$$

For the $(k+1)$-th test $\Phi_{k+1}$, it is desirable to select sensors such that the test result $Z_{k+1}$ provides 
the most information for the estimation of the true sensor state. A basic result from information theory [29] 
tells us that maximizing the information content is equivalent to maximizing the variance of $Z_{k+1}$. 
This criterion can be expressed as follows: 

$$\Phi_{k+1} = \arg\max_{\Phi_{k+1}} \mathrm{VAR}\left[Z_{k+1} \mid \Phi_{k+1}, \{P_{S,k}\}_{S \in \mathcal{S}}\right]. \tag{11}$$

$Z_{k+1}$ conditioned on $\Phi_{k+1}$, $\{P_{S,k}\}_{S \in \mathcal{S}}$ has a Bernoulli distribution. If we denote by $\Omega_k$ the probability 
that all sensors in test pool $\Phi_{k+1}$ are normal given the estimate after the $k$-th observation, then the 
above variance is given as follows, noting that $Z_{k+1} = 0$ either when all sensors in $\Phi_{k+1}$ are normal 
and the group test is correct or when at least one sensor in $\Phi_{k+1}$ is abnormal and the group test 
is incorrect, i.e., $Z_{k+1} = 0$ with probability $((1 - \alpha)\Omega_k + \beta(1 - \Omega_k))$, and similarly $Z_{k+1} = 1$ with 
probability $(\alpha\Omega_k + (1 - \beta)(1 - \Omega_k))$: 

$$\begin{aligned}
\mathrm{VAR}\left[Z_{k+1} \mid \Phi_{k+1}, \{P_{S,k}\}_{S \in \mathcal{S}}\right]
&= \big((1 - \alpha)\Omega_k + \beta(1 - \Omega_k)\big)\big(\alpha\Omega_k + (1 - \beta)(1 - \Omega_k)\big) \\
&= \beta - \beta^2 + (1 - 2\beta)(1 - \alpha - \beta)\Omega_k - (1 - \alpha - \beta)^2 \Omega_k^2.
\end{aligned} \tag{12}$$

The above computation, however, is generally intractable due to the large state space $\mathcal{S}$ when the 
number of sensors is large. We thus adopt the following approximation by assuming conditional 
independence between different sensors' fault states, i.e., 

$$P(S_1, S_2, \ldots, S_N \mid \{P_{S,k}\}_{S \in \mathcal{S}}) = \prod_{i=1}^{N} P(S_i \mid \{P_{S,k}\}_{S \in \mathcal{S}}). \tag{13}$$

With this assumption we have $\Omega_k = \prod_{i \in \Phi_{k+1}} P_{i,k}$, where we have used $i \in \Phi_{k+1}$ to mean that the 
$i$-th component of $\Phi_{k+1}$ is 1. 

While this assumption allows us to compute (12), finding the optimal solution to (11) remains hard 
when the number of sensors is large. Toward this end we propose a greedy algorithm for choosing a good 
$\Phi_{k+1}$ efficiently, by observing from (12) that its maximum is achieved when $\Omega_k = (1 - 2\beta)/(2(1 - \alpha - \beta))$. 
The greedy algorithm starts with a random sensor and calculates $\Omega_k$; in each successive step it selects 
a sensor such that the resulting new value of $\Omega_k$ is as close to $(1 - 2\beta)/(2(1 - \alpha - \beta))$ as possible. 
This is repeated until no additional sensor can bring $\Omega_k$ closer to $(1 - 2\beta)/(2(1 - \alpha - \beta))$. As $\Omega_k$ 
is monotonically decreasing in the inclusion of new sensors, the algorithm is guaranteed to terminate 
with a new test pool. 
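
The greedy pool selection can be sketched as below. The sketch assumes $\alpha + \beta < 1$, takes the starting sensor as an argument (`first`) instead of drawing it at random so that the result is reproducible, and resolves ties among candidates by index order; these choices and the function names are our own.

```python
def outcome_variance(omega, alpha, beta):
    """Variance of the Bernoulli outcome Z_{k+1} as a function of omega, Eq. (12)."""
    p_zero = (1 - alpha) * omega + beta * (1 - omega)
    return p_zero * (1 - p_zero)

def greedy_pool(p_normal, alpha, beta, first):
    """Grow a pool so that omega, the product of P_{i,k} over the pool, approaches the
    variance-maximizing value (1 - 2*beta) / (2*(1 - alpha - beta))."""
    target = (1 - 2 * beta) / (2 * (1 - alpha - beta))
    pool, omega = [first], p_normal[first]
    while True:
        candidates = [i for i in range(len(p_normal)) if i not in pool]
        if not candidates:
            break
        best = min(candidates, key=lambda i: abs(omega * p_normal[i] - target))
        if abs(omega * p_normal[best] - target) >= abs(omega - target):
            break                       # no remaining sensor brings omega closer to the target
        pool.append(best)
        omega *= p_normal[best]
    return pool, omega

# Ten sensors, each believed normal with probability 0.9; symmetric test errors of 0.1.
pool, omega = greedy_pool([0.9] * 10, alpha=0.1, beta=0.1, first=0)
```

In this example the target value is 0.5 and the procedure stops after seven sensors, since $0.9^7 \approx 0.478$ is closer to 0.5 than either $0.9^6$ or $0.9^8$.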

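Having designed $\Phi_{k+1}$ and observed $Z_{k+1}$, the probability $P_{S,k+1}$ can be updated from $P_{S,k}$ for all 
$S \in \mathcal{S}$: 

$$\begin{aligned}
P_{S,k+1} &= P(S \mid \Phi^{k+1}, Z^{k+1}) \\
&= \frac{P(Z_{k+1} \mid Z^k, S, \Phi^{k+1})\, P(S \mid \Phi^k, Z^k)\, P(\Phi^{k+1}, Z^k)}{P(\Phi^{k+1}, Z^{k+1})} \\
&= P(Z_{k+1} \mid Z^k, S, \Phi^{k+1})\, P_{S,k} / \Lambda_k,
\end{aligned} \tag{14}$$

where $\Lambda_k$ is the normalizing factor $P(\Phi^{k+1}, Z^{k+1}) / P(\Phi^{k+1}, Z^k)$ and is equal to $\sum_S P(Z_{k+1} \mid Z^k, S, \Phi^{k+1}) P_{S,k}$. 
Note that $P(Z_{k+1} \mid Z^k, S, \Phi^{k+1}) = P(Z_{k+1} \mid \Phi_{k+1} S)$ as $Z_{k+1}$ only depends on the error-free test result 
$\Phi_{k+1} S$; recall the two types of errors are given by $P(Z_{k+1} = 1 \mid \Phi_{k+1} S = 0) = \alpha$ and 
$P(Z_{k+1} = 0 \mid \Phi_{k+1} S = 1) = \beta$. 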
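
For a small network the exact update over all $2^N$ fault states can be written out directly, which is useful as a reference when checking approximations. The sketch below is our own illustration: it stores the distribution over states in a dictionary keyed by the state tuple, and the independent prior in the example is an assumption.

```python
from itertools import product

def exact_update(p_state, pool, z, alpha, beta):
    """Posterior over all fault states after observing outcome z of the given test pool."""
    posterior = {}
    for s, prior in p_state.items():
        w = int(any(s[i] for i in pool))          # error-free outcome for state s
        if w == 0:
            lik = alpha if z == 1 else 1 - alpha
        else:
            lik = 1 - beta if z == 1 else beta
        posterior[s] = lik * prior
    norm = sum(posterior.values())                # normalizing factor
    return {s: v / norm for s, v in posterior.items()}

def marginal_normal(p_state, i):
    """Probability that sensor i is normal (S_i = 0)."""
    return sum(p for s, p in p_state.items() if s[i] == 0)

# Two sensors, each normal with prior probability 0.9, assumed independent a priori.
prior = {s: (0.9 if s[0] == 0 else 0.1) * (0.9 if s[1] == 0 else 0.1)
         for s in product((0, 1), repeat=2)}
post = exact_update(prior, pool=[0, 1], z=1, alpha=0.1, beta=0.1)
p0 = marginal_normal(post, 0)
```

After one positive test on both sensors with $\alpha = \beta = 0.1$, the probability that sensor 1 is normal drops from 0.9 to $9/14 \approx 0.643$.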

Updating the sensor state probabilities using (14) for each $S \in \mathcal{S}$ can be computationally 
prohibitive for large $N$ ($|\mathcal{S}| = 2^N$). Below we show that using the conditional independence assumption 
we can instead update $P_{i,k+1}$ directly without calculating $P_{S,k+1}$, thus reducing the complexity from 
$O(2^N)$ to $O(N)$. We first calculate the normalization constant, and then update $P_{i,k+1}$ accordingly. 

Given a test pool $\Phi_{k+1}$, we will refer to the set of sensor states $\{S : \Phi_{k+1} S = 1\}$ as the positive 
set, and $\{S : \Phi_{k+1} S = 0\}$ as the negative set. Note that by definition, we have $\sum_{S : \Phi_{k+1} S = 0} P_{S,k} = \Omega_k$ 
and $\sum_{S : \Phi_{k+1} S = 1} P_{S,k} = 1 - \Omega_k$. By separating $\mathcal{S}$ into these two sets, $\Lambda_k$ can be calculated as follows: 

$$\begin{aligned}
\Lambda_k &= \sum_{S} P(Z_{k+1} \mid \Phi_{k+1} S)\, P_{S,k} \\
&= \sum_{S : \Phi_{k+1} S = 1} P(Z_{k+1} \mid \Phi_{k+1} S)\, P_{S,k} + \sum_{S : \Phi_{k+1} S = 0} P(Z_{k+1} \mid \Phi_{k+1} S)\, P_{S,k} \\
&= P(Z_{k+1} \mid \Phi_{k+1} S = 1)(1 - \Omega_k) + P(Z_{k+1} \mid \Phi_{k+1} S = 0)\, \Omega_k.
\end{aligned} \tag{15}$$

Therefore, if the test result is positive, $Z_{k+1} = 1$, then $\Lambda_k = (1 - \beta)(1 - \Omega_k) + \alpha\Omega_k$; if the test 
result is negative, $Z_{k+1} = 0$, then $\Lambda_k = \beta(1 - \Omega_k) + (1 - \alpha)\Omega_k$. 

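We next show how $P_{i,k+1}$ is updated. If sensor $i \in \Phi_{k+1}$, then every state with $S_i = 1$ belongs to 
the positive set, and using (14) we have 

$$\begin{aligned}
P_{i,k+1} &= \sum_{S : S_i = 0} P_{S,k+1} = 1 - \sum_{S : S_i = 1} P_{S,k+1} \\
&= 1 - \sum_{S : S_i = 1} P_{S,k}\, P(Z_{k+1} \mid \Phi_{k+1} S = 1) / \Lambda_k \\
&= 1 - (1 - P_{i,k})\, P(Z_{k+1} \mid \Phi_{k+1} S = 1) / \Lambda_k \\
&= \begin{cases} 1 - (1 - P_{i,k})(1 - \beta) / \Lambda_k & \text{if } Z_{k+1} = 1, \\ 1 - (1 - P_{i,k})\, \beta / \Lambda_k & \text{if } Z_{k+1} = 0. \end{cases}
\end{aligned}$$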

If sensor $i \notin \Phi_{k+1}$, then using (14) we have: 

$$\begin{aligned}
P_{i,k+1} &= \sum_{S : S_i = 0} P_{S,k+1} = \sum_{S : S_i = 0,\, \Phi_{k+1} S = 1} P_{S,k+1} + \sum_{S : S_i = 0,\, \Phi_{k+1} S = 0} P_{S,k+1} \\
&= \sum_{S : S_i = 0,\, \Phi_{k+1} S = 1} P_{S,k}\, P(Z_{k+1} \mid \Phi_{k+1} S) / \Lambda_k + \sum_{S : S_i = 0,\, \Phi_{k+1} S = 0} P_{S,k}\, P(Z_{k+1} \mid \Phi_{k+1} S) / \Lambda_k \\
&= \big(P_{i,k}(1 - \Omega_k)\, P(Z_{k+1} \mid \Phi_{k+1} S = 1) + P_{i,k}\, \Omega_k\, P(Z_{k+1} \mid \Phi_{k+1} S = 0)\big) / \Lambda_k \\
&= P_{i,k}\, \Lambda_k / \Lambda_k = P_{i,k},
\end{aligned}$$

where the fourth equality is due to the independence assumption. As a result, when $i \notin \Phi_{k+1}$, the 
corresponding $P_{i,k+1}$ remains unchanged. 

Figure 2: Illustration of different faults on a sinusoidal signal: (a) Spike, (b) Non-linearity, (c) Mean 
drift, (d) Excessive noise and (e) Non-linear fault model 

The above computational procedure is repeated after each test, starting from some assumed initial
prior $P_{i,0}$. After $k$ tests and given $Z^k$ and $\Phi^k$, the sensor fault state $S$ can be recovered in two
ways: (1) use the maximum a posteriori probability (MAP) estimator $\arg\max_S P(Z^k \mid S, \Phi^k)\, P_{S,k}$, or (2)
declare sensor $i$ faulty if $P_{i,k} < \sigma$ for some predefined threshold $\sigma$, and normal otherwise. While
both are valid, the second method is preferred as $P_{i,k}$ is readily available from the above updating
procedure, whereas the MAP estimation is computationally much more complex. The performance of
these two methods is similar, as we show in Section 7.
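
The per-sensor update and the threshold decoder can be sketched in a few lines of Python. This is a
minimal illustration rather than the authors' implementation. It assumes that $\Omega_k$ is the product of
the $P_{i,k}$ over the test pool (which is what the independence assumption implies), that $\alpha$ is the false
positive probability and $\beta$ the false negative probability of a group test, and the function names
`bgt_update` and `decode_by_threshold` are hypothetical.

```python
from math import prod


def bgt_update(p_normal, pool, z, alpha, beta):
    """One Bayesian update of the per-sensor 'normal' probabilities.

    p_normal: list of P_{i,k}, the probability that sensor i is normal
    pool:     indices of the sensors in the test pool Phi_{k+1}
    z:        observed group test result (1 = positive, 0 = negative)
    alpha:    assumed false positive probability, P(Z=1 | pool has no fault)
    beta:     assumed false negative probability, P(Z=0 | pool has a fault)
    """
    # Assumption: sensors are independent, so Omega_k is a product.
    omega = prod(p_normal[i] for i in pool)
    lik_positive_set = (1 - beta) if z == 1 else beta      # P(Z | Phi S = 1)
    lik_negative_set = alpha if z == 1 else (1 - alpha)    # P(Z | Phi S = 0)
    a_k = lik_positive_set * (1 - omega) + lik_negative_set * omega
    if a_k == 0:
        raise ValueError("observed result has zero probability under the model")
    updated = list(p_normal)
    for i in pool:  # sensors outside the pool keep their probability
        updated[i] = 1 - (1 - p_normal[i]) * lik_positive_set / a_k
    return updated, a_k


def decode_by_threshold(p_normal, sigma):
    """Return the sensors declared faulty, i.e. those with P_{i,k} < sigma."""
    return [i for i, p in enumerate(p_normal) if p < sigma]
```

With error-free tests ($\alpha = \beta = 0$), a negative result drives every pooled sensor to probability 1
of being normal, while a positive result on a two-sensor pool with $P_{i,k} = 0.9$ each lowers both to
$0.9 \cdot 0.1 / 0.19 \approx 0.47$.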


5 Experimental Setup 

The proposed CGT and BGT fault detection algorithms are evaluated using a set of measured vibration 
data collected by wireless sensors from the New Carquinez Bridge in California. In this section, we 
first present common sensor fault types and then detail the nature of the measured data and how it 
is used in our evaluation. 

5.1 Sensor Fault Types 

We consider four different fault types: spike, non-linear transduction, mean drift and excessive noise 
in the controlled experiments. These are illustrated in Fig. 2 on a sinusoidal signal. More specifically,
a spike fault is an impulse superimposed on normal sensor measurements. Spikes are assumed
to occur randomly in time with constant or varying magnitudes (consistent with a random signal
model). Moreover, the occurrence of these spikes is assumed to be sparse. A non-linearity fault represents
an abnormal discrepancy between the sensor input and output. This fault usually happens when the 
measurement falls outside a certain dynamic range. In this study, a simple non-linear fault model is 
used, as shown in Fig. 2(e): when the measurement is within the normal region, the sensor output
reflects the measurement; otherwise the output follows the slope Sj. A mean drift fault preserves the
output dynamics but not its mean value. This type of fault generates outputs whose mean drifts away 
from the true mean of the signal slowly compared to the output dynamics. Finally, excessive noise 
refers to a large amount of Gaussian noise in the output of a sensor. Compared to regular measurement
noise, this fault has a much higher amplitude, such that the output signal is highly corrupted. Note that
only the non-linearity fault is a function of the measured signal while the other fault types are not. 
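
The four fault models can be sketched as simple signal transformations. The code below is an
illustration under stated assumptions, not the injection code used in the experiments: spikes are given
a random sign, the non-linearity is applied symmetrically around zero, and the mean drift is modeled as
a linear ramp. All function and parameter names are hypothetical.

```python
import math
import random


def add_spikes(x, rate, magnitude, rng):
    """Superimpose sparse impulses: each sample is hit with probability `rate`."""
    return [v + (magnitude * rng.choice((-1, 1)) if rng.random() < rate else 0.0)
            for v in x]


def add_nonlinearity(x, limit, slope):
    """Inside +/-limit the output equals the input; outside it follows `slope`."""
    out = []
    for v in x:
        if abs(v) <= limit:
            out.append(v)
        else:
            out.append(math.copysign(limit + slope * (abs(v) - limit), v))
    return out


def add_mean_drift(x, drift_per_sample):
    """Add a slowly growing offset (assumed linear); the dynamics are unchanged."""
    return [v + drift_per_sample * n for n, v in enumerate(x)]


def add_excess_noise(x, std, rng):
    """Add zero-mean Gaussian noise with standard deviation `std`."""
    return [v + rng.gauss(0.0, std) for v in x]
```

Only `add_nonlinearity` depends on the value of the measured signal; the other three add a disturbance
that is independent of it, matching the remark above.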
Figure 3: Plan map of the deployed sensors. Credit: Yilan Zhang
Figure 4: Vibration measurement of a sensor

5.2 Bridge Vibration Data and State Estimation 

We evaluate our detection method using bridge vibration data collected by a network of 18 vibration 
sensors deployed on the New Carquinez Bridge in California. This is a 1056-meter long suspension 
bridge which connects Crockett and Vallejo. The locations of these 18 sensors are shown in Fig. 3.
They monitor the bridge vibration in the direction perpendicular to the bridge surface. Fig. 4 shows
an example of the output of a sensor when vehicles pass through. We took 18 data traces at the
beginning of the deployment and performed manual inspection. Each data trace consists of 50 seconds
of data sampled at 200 Hz. All tests, including spectrum analysis and mode-shape calculation on the
data suggest that the data traces are correct. 

Our first task is to use the collected data to train the linear dynamical model needed in the group
testing algorithms. For this we adopt a commonly used approach, the subspace method [30], which
uses measured output (and input, if available) to calculate model parameters such as the matrices A,
B (if input data is available) and C in the state-space model (10). Notice that the excitation/input to
the bridge is in general unavailable. While input is not necessary for learning the system model by the
subspace method, a prior study suggests that the input can be assumed to be Gaussian for large structures
with complex excitations, and that this leads to a better learned system model in terms of output
prediction [31]. For our study, we use half of the vibration data from each of the 18 traces for
training the bridge dynamical model, and the other half for evaluating the group testing method.
The order of the dynamical model is set to 162, i.e., the length of the state vector is 162; an earlier
study of the bridge [32] indicates that a 162-order state-space model is sufficient to capture the bridge
dynamics. The excitation inputs are assumed to be 18 degree-of-freedom Gaussian signals, and
each degree-of-freedom input has zero mean and variance equal to the variance of the output of the
sensors.

Two experiments are then conducted to evaluate the performance of the proposed algorithms. The
first is a control experiment, whereby different fault types are artificially created and superimposed
over a random subset of the data traces. The resulting data are then used for evaluation purposes.
Specifically, we add different types of faults to the bridge data by randomly selecting up to two sensors
(a number is first chosen uniformly from {0, 1, 2}, and then that many faulty sensors are
chosen uniformly among the 18 sensors). We set the maximum number of faulty sensors to 2 (d = 2)
so as to keep the percentage of faulty sensors around 10%. A total of 100 random runs are conducted
for the experiments (over the choice of the number and identity of the faulty sensors, as well as over
the random injection of faults and the generation of the test matrix).
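
The two-stage random draw described above can be sketched as follows. The function name
`draw_faulty_sensors` and its parameters are hypothetical; the defaults mirror the 18-sensor network and
d = 2.

```python
import random


def draw_faulty_sensors(n_sensors=18, d_max=2, rng=None):
    """Pick the faulty sensors for one random run.

    First draw the number of faulty sensors uniformly from {0, ..., d_max},
    then draw that many distinct sensors uniformly from the network.
    """
    rng = rng or random.Random()
    count = rng.randint(0, d_max)  # randint is inclusive at both ends
    return sorted(rng.sample(range(n_sensors), count))
```

Passing a seeded `random.Random` instance makes a set of runs reproducible.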
In addition to the control experiment, we also evaluated the CGT and the BGT algorithms on
real sensor faults. Several weeks after deployment, sensor 11 started to exhibit errors in its data (this
was again determined by manual and visual inspection). As shown in Fig. 9, the output of sensor 11 shows
prominent spikes beyond normal fluctuation, and possibly has a shift in the mean amplitude and a 
small mean-drift error as well. It should be noted that this observation is not the exact ground truth 
but is the closest one could get under the circumstances (the alternative is to take the sensor off the 
bridge and calibrate it in a lab; even if we could do so the result is only valid if the same type of faults 
persists in the lab setting). 


6 Performance of the Combinatorial Group Testing (CGT) Method 

In this section we evaluate the performance of the CGT algorithm. The performance in detecting
different fault types is evaluated by control experiments. The algorithm is then evaluated on detecting
the real faulty sensor shown in Fig. 9. Finally, we compare the CGT algorithm to non-group testing
methods, in terms of accuracy and efficiency.

Figure 5: Detection and false alarm rates for spike faults.
Figure 6: Detection and false alarm rates for non-linearity faults.
Figure 7: Detection and false alarm rates for mean-drift faults.
Figure 8: Detection and false alarm rates for excessive noise.


Figure 9: Abnormal vibration measurement of sensor 11.

We first examine the performance of the CGT algorithm as a function of the detection threshold
used in each group test and the number of tests performed. Fig. 5 shows the detection rate (the
number of detected faulty sensors over the total number of faulty sensors) and false alarm rate in
detecting spike faults under different numbers of tests and threshold levels. The spike fault was set to
appear at 5% of the samples and have mean amplitude equal to the variance of the sensor output,
which is common among spike faults in sensors. As can be seen, as the number of tests increases,
the detection rate increases while the false alarm rate decreases. When 14 tests are used, the detection rate
is above 85% and false alarm is below 1%, with a threshold of 2 x 10“^. Similarly, when 16 tests are 
used, the accuracy is over 93% and remains above 80% with a threshold less than 2 x 10“^. 
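
The two metrics reported in these figures can be computed per run as sketched below. The detection
rate follows the definition given above; the false alarm rate is assumed here to be the number of normal
sensors declared faulty over the total number of normal sensors, which is a common convention but is
not spelled out in the text. The function name is hypothetical.

```python
def detection_and_false_alarm(true_faulty, declared_faulty, n_sensors):
    """Return (detection rate, false alarm rate) for one run.

    A rate is None when its denominator is zero (no faulty or no normal sensors).
    """
    true_faulty, declared_faulty = set(true_faulty), set(declared_faulty)
    n_normal = n_sensors - len(true_faulty)
    detected = len(true_faulty & declared_faulty)
    false_alarms = len(declared_faulty - true_faulty)
    detection_rate = detected / len(true_faulty) if true_faulty else None
    false_alarm_rate = false_alarms / n_normal if n_normal else None
    return detection_rate, false_alarm_rate
```

For example, with sensors 3 and 11 faulty in an 18-sensor network and sensors 5 and 11 declared faulty,
the detection rate is 1/2 and the false alarm rate is 1/16.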

In all cases we see a fairly wide region of threshold values within which the method enjoys high 
detection rate (> 80%) and low false alarm (< 2%). This is clearly a desired operating regime for this 
method. We observe that the detection rate first increases and then drops slowly as the threshold increases.
When the threshold increases beyond a certain value (e.g., 3 x 10“^), the detection rate quickly drops 
and eventually reaches zero. The false alarm moves in the opposite direction though to a lesser degree. 
To explain this phenomenon we note there are two sources of error at play, one due to Kalman filtering 
and the other due to the recovery algorithm. When the threshold is very low, measurement noise or
inaccuracy in the model could easily result in false positives in the group test. These incorrect group
test results cause the recovery algorithm to err, leading to both a high false alarm rate and a low detection
rate. As the threshold increases the error from recovery decreases, which more than compensates for
the decreased sensitivity in the group tests, achieving an overall better tradeoff. When the threshold 
increases beyond a certain level, the group test becomes insensitive to faults and eventually declares 
all groups normal, resulting in reduced detection rate and false alarm. 

The same evaluation is done for the other fault types; these are shown in Figs. 6 and 7. For the
non-linearity fault shown in Fig. 6, the normal dynamic range is set to 80% of the output maximum,
with a slope in the abnormal region of 0.3. For the mean-drift error shown in Fig. 7, the mean
drift has a maximum frequency of 5 Hz and a magnitude of 50% of the sensor output variance. All
these results show similar behavior to those observed in the spike fault case. Within the preferred 
threshold range, the detection rate generally exceeds 80% in accuracy while false alarm remains low. 
Furthermore, the preferred threshold range is smaller when the fault is less pronounced. Finally, the 
detection performance is tested when the sensor data is corrupted by excessive Gaussian noise with 
zero mean and variance equal to 50% of the variance of sensor output. The result presented in Fig. 8
shows that the proposed method is not recommended for detecting this type of fault. The poor
performance in this case is due to the fact that Kalman filtering, in computing statistically optimal 
estimates of the system state, tends to eliminate noise variance existing in the sensor measurement. 
Consequently, zero-mean noise is sufficiently suppressed in the estimate and may not be reflected in 
the residual of a group test. 

For detecting the faulty sensor 11 shown in Fig. 9, we used our algorithm on the 18 sensors with
6 and 8 tests respectively. Under the same preferred threshold range (between 3 x 10“^ and 1 x 10““^) 
shown in the control experiment, our algorithm was able to identify the faults in sensor 11, with a 
detection rate > 78% (> 92%) and false alarm < 1.8% (< 0.7%) when using 6 (resp. 8) tests. 

Figure 10: Detection rate under different measurement noises and fault types with a single faulty
sensor.
Figure 11: Detection rate under different measurement noises and fault types with two faulty sensors.

Next, the proposed combinatorial group-testing based detection method is compared to two existing
Kalman-filter based methods: Kobayashi et al. [5] and Da et al. [Sj. Both Kobayashi and Da are
based on a bank of Kalman filters. Specifically, with N sensors in the network, N fault detection tests
(N Kalman filters) are required to evaluate all sensors in the network. In each test, all sensors but
one are involved, i.e., test i uses N - 1 sensors excluding sensor i. A key assumption in this method
is that there is only one faulty sensor in the network, thus the test which does not contain the faulty
sensor will have different characteristics than the other N - 1 tests, and thus the single faulty sensor
may be identified.
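
The structure of such a bank of tests can be sketched as follows. This is only an illustration of the
leave-one-out idea: the residual values are assumed to come from N Kalman filters that are not shown,
and picking the test with the smallest residual is a hypothetical decision rule, not the exact rule of
either cited method. The function names are hypothetical as well.

```python
def leave_one_out_pools(n_sensors):
    """Test i uses every sensor except sensor i, giving N tests for N sensors."""
    return [[j for j in range(n_sensors) if j != i] for i in range(n_sensors)]


def suspect_from_residuals(residuals):
    """Return the index of the test with the smallest residual.

    Under the single-fault assumption, the one test that excludes the faulty
    sensor should stand out from the other N - 1 tests.
    """
    return min(range(len(residuals)), key=residuals.__getitem__)
```

The number of tests always equals the number of sensors, which is the cost that group testing aims to
reduce when faults are rare.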

The difference between these two methods lies in how to compare the test outcomes to determine 
the different characteristics with and without the faulty sensor. Under the method by Kobayashi, 
the estimated sensor output from the Kalman filter is compared to the corresponding observed sensor 
output. The test which does not contain the faulty sensor will have higher consistency result than 
the other tests. Under the method by Da, a reference system state estimate is generated by using all
N sensor inputs, to which each test compares the estimated system state (from N - 1 sensors). The
test that does not contain the faulty sensor is supposed to have lower consistency result because the 
reference contains the faulty sensor while the test does not. 

Fig. 10 shows the detection rate of the three methods under different types of faults, different
measurement noises, and with a single faulty sensor, using the same set of bridge data. As we can see,
Kobayashi’s and Da’s methods achieve performance similar to that of our proposed method when 8 to 10
tests are used. This result is to be expected when the assumption of no more than one faulty sensor holds,
since all methods are based on Kalman filtering. As shown in Section 2, the complexity of Kalman
filtering largely depends on the size of the system state s, rather than the number of sensors used
in state estimation. One detection test of Da’s and Kobayashi’s algorithms has similar complexity
as one group test of the proposed method if the sensor network size remains the same. Therefore,
our proposed method is able to achieve similar, and sometimes better, accuracy when around 8 to 10
tests are used, which is about half of the complexity of Kobayashi’s and Da’s methods (18 tests). The
results in Fig. 10 also suggest, as seen earlier, that Kalman filter based fault detection systems are
insensitive to Gaussian measurement noise. No significant degradation in the detection rate and false
alarm is observed when the variance of the measurement noise increases from 0% of output variance
to 30% of output variance.

When the system has two faulty sensors, the performance of Kobayashi’s and Da’s methods deteriorates
sharply as the reference systems are contaminated by faulty sensor observations. If the false
alarm rate is restricted to a reasonable level (5%), the accuracy of Da’s method drops to about 55%
and Kobayashi’s method drops to about 50% for non-linearity faults and to about 20% for spike and
mean drift faults (Fig. 11). At the same time, the proposed algorithm maintains over 85% accuracy
for all fault types. Therefore, compared to other model-based methods, the proposed CGT method
makes fewer assumptions about the underlying system and the nature of the faults. It achieves high accuracy
with much lower complexity than existing methods, which is crucial for very large sensor networks.
Furthermore, the above comparison shows that the proposed method is insensitive to measurement
noise. When the system has three faulty sensors, the CGT method is able to achieve a high detection
rate (90% or higher) by increasing the number of tests (25 tests for detecting spike faults, 24 tests for
detecting non-linearity faults and 27 tests for detecting mean-drift faults). While these exceed the size of
the network (18 sensors), this method does not require the existence of a reference system/sensor.

7 Performance of the Bayesian Group Testing (BGT) Method 

The fault detection performance of the BGT method is evaluated by two experiments. Under the first 
experiment we use the same bridge sensor data as we did for CGT. Hence, the CGT and the BGT 
methods can be directly compared. The second experiment uses a much larger system (with 1000 
sensors). In this context the BGT method is compared to a well-known divide-and-conquer based
adaptive group testing method proposed by Hwang. The impact of the initial prior, $P_{i,0}$, on the
fault detection performance is also addressed.

7.1 Performance of BGT on the New Carquinez Bridge sensors

Figure 12: The fault detection performance of the CGT method and the BGT method.
Figure 13: The fault detection performance of the MAP decoder and the $P_{i,k}$ based decoder.

Fig. 12 shows the performance comparison between BGT (with the $P_{i,k}$ decoder) and CGT. Fig. 13
compares the two decoders for BGT: the MAP and the $P_{i,k}$ decoder. Both comparisons are evaluated
on the same New Carquinez Bridge data, where 2 out of 18 sensors are faulty. For BGT the initial
prior $P_{i,0}$ is set to 2/18 for each sensor. The first test pool is randomly generated with each sensor
having probability 1/2 of being selected. The group test error probabilities $\alpha$ and $\beta$ are set to 0.01.
When the $P_{i,k}$ decoder is used, sensor $i$ is regarded as faulty when the corresponding $P_{i,k}$ is smaller
than 0.2. The results are obtained from 50 random runs using the same setup as in the CGT evaluation
in Section 6.

As shown in Fig. 12, BGT with the $P_{i,k}$ decoder outperforms CGT on detecting all types of faults
(when the number of tests > 6). BGT generally requires 3-4 fewer tests than the non-adaptive CGT
for 80% detection rate. Moreover, BGT uses 8 fewer tests to reach the saturation accuracy, which is
about a 50% improvement over CGT. The false alarm rates are similar (< 1%) for detecting different
types of faults. The improvement is primarily due to two sources: BGT uses previous test results to
design the next test, which leads to more effective tests; BGT is more conservative in deciding the 
sensor state (normal vs. faulty) and thus more robust when the group test is incorrect. 

For detecting the faulty sensor 11 shown in Fig. 9, the initial prior is set to 1/18 for all $i$. By
selecting the first test pool $\Phi_1$ randomly, the BGT method is able to achieve 56% detection rate (0%
false alarm) when 5 tests are used and 100% detection rate (0% false alarm) when 6 tests are used.
The BGT algorithm saves 2 tests compared to the CGT algorithm for the same data set.

Fig. 13 compares the two state recovery methods introduced in Section 3. On average the MAP
method is able to save one test for achieving the same accuracy as the $P_{i,k}$-based method. However,
the MAP method has higher false alarm when the number of tests falls below 7. Also, the $P_{i,k}$-based
method is preferred for large scale networks due to its low complexity. Note that neither
decoding method requires the knowledge of d, the maximum number of faulty sensors. This is a
significant benefit over CGT if d is difficult to estimate. CGT is not able to get the correct result if d
is underestimated, due to the d-disjunct matrix requirement. Consequently, if d is unknown then an
overestimate is recommended for CGT, which then leads to an over-provisioning of the number of
tests.

7.2 Performance of BGT in larger scale systems 

We next evaluate the performance of BGT in a large scale network (1000 sensors) and examine how 
it varies with the number of faulty sensors and group test error probabilities. A comparison between 
BGT and the divide-and-conquer adaptive group testing method proposed by Hwang is presented. We
note that Hwang’s method is designed for noiseless group test systems so it is not expected to work well 
with noisy group tests. Nevertheless, it is meaningful to compare the two and quantify the difference 
under both noisy and noiseless conditions. We also address the common prior initialization problem 
in Bayesian inference which also applies to BGT. 

For lack of real data on large networks, the experiments and results presented in this section are 
simulation based. Out of the 1000 sensors, d are randomly chosen and labeled as faulty. A group test 
result is first determined by whether the test pool contains any faulty sensors and then randomized
according to the error model $\alpha = \beta$, i.e., with probability $\alpha$, the test result is flipped. In other words,
we do not actually perform Kalman filtering based detection in this set of experiments, but its effect
is simulated via this error model. 
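
The simulated group test described above amounts to a Boolean OR followed by a random flip. A minimal
sketch, with a hypothetical function name and an explicit random generator for reproducibility:

```python
import random


def simulate_group_test(pool, faulty, alpha, rng):
    """Return a noisy group test result (1 = positive, 0 = negative).

    The true result is positive when the pool contains any faulty sensor;
    with probability alpha (= beta in this error model) the result is flipped.
    """
    truth = int(any(i in faulty for i in pool))
    if rng.random() < alpha:
        return 1 - truth
    return truth
```

Setting `alpha` to 0 recovers the noiseless group test used in the error-free comparison.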

Hwang’s method is based on the well-known binary search, whereby the network is first divided
into 2 groups of equal size, and each is subject to the same group test process. If the result is negative,
then all sensors in that group are declared normal and removed from further testing; if the result is
positive, then the group is further divided into two smaller groups of equal size and the same process
repeats until all faulty sensors have been identified. Hwang’s method improves on the standard binary
search as follows. It assumes knowledge of the number of faulty sensors d (or an upper
bound on d), and uses d to determine the size of a group. Specifically, when d is small compared to
the total number of uncertain sensors, a large test pool is used. The idea is that upon a negative
result a large number of sensors may be declared normal, and a new test pool can be selected from the
remaining uncertain sensors; if the result is positive, the next test pool is generated randomly from
the entire set of uncertain sensors, including the pool just tested positive. Finally, when the number
of remaining faulty sensors (d minus the number of detected faulty sensors) is larger than half of the
number of remaining uncertain sensors, the test is performed on an individual basis.
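
The standard binary search that serves as the starting point can be sketched as below. This is the
plain binary splitting baseline for noiseless tests, not Hwang's generalized algorithm (it does not use
d to size the pools). The callable `is_positive` stands in for a group test and is a hypothetical
interface.

```python
def binary_splitting(sensors, is_positive):
    """Identify faulty sensors by recursive halving with noiseless group tests.

    Returns the sorted list of faulty sensors and the number of tests used.
    """
    sensors = list(sensors)
    mid = len(sensors) // 2
    # Start from two halves of the network; the stack holds groups to test.
    stack = [sensors[mid:], sensors[:mid]]
    faulty, tests = [], 0
    while stack:
        group = stack.pop()
        if not group:
            continue
        tests += 1
        if not is_positive(group):
            continue  # negative: every sensor in the group is declared normal
        if len(group) == 1:
            faulty.append(group[0])
        else:
            half = len(group) // 2
            stack.extend([group[half:], group[:half]])
    return sorted(faulty), tests
```

For 16 sensors with sensors 3 and 12 faulty, this procedure finds both with 14 tests. A single wrong
negative result would silently discard a whole group, which is why this family of methods degrades
quickly when tests are noisy.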

As mentioned, Hwang’s method is designed for error-free group tests, so it does not handle
errors well. In particular, if a positive group is mistakenly detected as negative, this method will 
declare all faulty sensors in this group as normal and no further tests will be performed on them. By 
contrast, BGT only decreases the probability of each tested sensor being normal, and they may be 
tested again in the future. The comparison study here thus mainly serves to quantify the improvement 
we can achieve when taking test errors into account. 

Figs. 14-16 show the performance of BGT and Hwang’s method (d = 4, 10 and 50)
under various group test error rates ($\alpha$). When group tests are error-free (Fig. 14), Hwang’s method
is able to achieve accurate results with fewer tests than BGT. As expected, when group tests are
noisy ($\alpha = 0.03$ in Fig. 15 and $\alpha = 0.05$ in Fig. 16), BGT performs better while Hwang’s method
deteriorates rapidly.

Figure 14: The comparison of BGT and Hwang’s methods with $\alpha = 0$.
Figure 15: The comparison of BGT and Hwang’s methods with $\alpha = 0.03$.
Figure 16: The comparison of the BGT method and Hwang’s methods with $\alpha = 0.05$.
Figure 17: The performance of BGT under different initial priors $P_{i,0}$ with the first test pool
selected randomly.

A common challenge to most Bayesian inference based methods is the selection of the prior on
the hypothesis. Under BGT, the prior probability $P_{i,0}$ is required for designing a test pool. Fig. 17
shows the result of using different priors ($P_{i,0} \in \{0.3, 0.5, 0.7, 0.9, 0.96\}$, $\forall i$) when d = 4 in a
1000-sensor network with $\alpha = 0$. The case $P_{i,0} = 0.96$ represents the correct prior. The figure shows that
the performance is highly sensitive to the selection of the initial prior. However, this effect can be
alleviated by choosing the first set of test pools randomly. We see that when the first 25 test pools
are randomly selected (each sensor has probability 1/2 of being selected), the difference in performance
between different initial priors is significantly reduced (Fig. 18); when we increase this number to
50 tests (Fig. 19), this difference is largely eliminated. Thus this random selection at the beginning
serves as a very simple yet effective way to counter possible bad priors. It may be seen as a form of
exploration (random sampling) prior to exploitation (adaptive selection).
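
The exploration-then-exploitation schedule can be sketched as follows. The adaptive pool design rule of
BGT is not reproduced here; it is represented by a caller-supplied function `design_adaptive_pool`,
which, like the other names in this sketch, is hypothetical.

```python
import random


def random_pool(n_sensors, rng, p_select=0.5):
    """Include each sensor independently with probability p_select."""
    return [i for i in range(n_sensors) if rng.random() < p_select]


def choose_pool(test_index, n_random, n_sensors, rng, design_adaptive_pool):
    """Use random pools for the first n_random tests, adaptive pools afterwards.

    test_index is zero-based, so n_random = 25 makes tests 0..24 random.
    """
    if test_index < n_random:
        return random_pool(n_sensors, rng)
    return design_adaptive_pool()
```

The posterior is still updated after every test, random or adaptive; only the way the pool is chosen
changes after the first `n_random` tests.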



Figure 18: The performance of BGT under different priors $P_{i,0}$ with the first 25 test pools
selected randomly.

Figure 19: The performance of BGT under different priors $P_{i,0}$ with the first 50 test pools
selected randomly.


To summarize, BGT is able to achieve the same performance as CGT with fewer tests, and is well
suited for noisy group tests. Furthermore, unlike CGT and Hwang’s method, it does not require
knowledge of d. However, the adaptive design process prevents the use of parallel computing,
which is viable for CGT. Therefore, CGT may actually have a shorter run time if parallel computing is
used.


8 The Design and Performance of KF-BGT 

Standard group tests are modeled as Boolean operations. While both CGT and BGT work with
noisy group tests by modeling them as Boolean operations with an error probability, they do not take
into account other possible features of the group tests. In our case, the group tests are given by the
Kalman filtering based detection procedure, whose accuracy depends on not only the system model 
but also the test pools. This suggests that a better understanding of the relationship between the 
detection procedure and the test pool design may allow us to further improve the design of the test 
pools and in turn the accuracy of the method. This is the subject of investigation in this section. 

We note that the Kalman filter estimates the state of a system based on the system model and the
measurements from the sensors. As a system identification method is used to obtain the system model,
the model accuracy depends on the model order (the size of the system state, S) used. A higher order
model generally gives better model accuracy (before over-fitting occurs) but it also requires more
computational resources for the state estimation. The dependence of the state estimate accuracy on
the size of the test group is shown in Fig. 20. In this experiment, subgroups of different sizes are used
to estimate the system state. For each group size, the discrepancy $|S_A - S_B|_\infty$ is recorded both with
no faulty sensors in the subgroups and with one faulty sensor in one of the subgroups.
When there are no faulty sensors, the discrepancy $|S_A - S_B|_\infty$ is very close to zero. On the other
hand, $|S_A - S_B|_\infty$ is significantly larger in the presence of a single faulty sensor and increases with
the group size. This means that if a uniform detection threshold is used, then different group sizes
will result in significantly different detection error (i.e., group test error) probabilities. This further
suggests that it would be desirable to maintain the same group sizes for the state estimate so as to
keep the error probability constant and also to facilitate the choice of an optimal detection threshold.



Figure 20: The discrepancies in state estimates under different group sizes. 

Table 2: State estimate discrepancy $|S_A - S_B|_\infty$ under various faulty sensor distributions. (G: number
of good sensors, F: number of faulty sensors)

          8 sensors                              10 sensors
Sensor distribution    Discrepancy    Sensor distribution    Discrepancy
A: 0G 4F, B: 4G 0F     8.29           A: 0G 6F, B: 6G 0F     10.78
A: 1G 3F, B: 3G 1F     23.88          A: 1G 5F, B: 5G 1F     26.73
A: 2G 2F, B: 2G 2F     41.10          A: 2G 4F, B: 4G 2F     46.19
                                      A: 3G 3F, B: 3G 3F     67.01
A: 4G 0F, B: 4G 0F     7E-4           A: 6G 0F, B: 6G 0F     5E-4


We next examine the distribution of faulty sensors between two subgroups used in the filtering 
detection. Table 2 shows the state estimate discrepancy under various distributions of faulty sensors
across the two subgroups. These results show that the discrepancy is highest when faulty sensors are evenly
distributed between the two groups, e.g., having a faulty sensor in each subgroup is better than
placing two faulty sensors in one subgroup, as the larger discrepancy makes the detection more accurate.

Based on the above empirical observations, we propose the Kalman filtering (KF)-enhanced group
test (KF-BGT) that uses the following rule in addition to the operation of BGT: after a new test pool
has been selected using BGT, divide it evenly into two subgroups. If there are fewer than 3 sensors
in a subgroup, then sensors outside the test pool that have a high probability of being normal are added
to the subgroups before performing Kalman filtering.
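
The subgroup rule can be sketched as follows. The sketch assumes that a short subgroup is padded until
it reaches exactly 3 sensors and that padding candidates are taken in decreasing order of their
probability of being normal; the text does not fix these details, and the function name is hypothetical.

```python
def split_and_pad(pool, p_normal, min_size=3):
    """Split a test pool into two near-equal subgroups and pad short ones.

    pool:     sensor indices selected by BGT for the next test
    p_normal: list of P_{i,k}, probability that sensor i is normal
    """
    pool = list(pool)
    half = (len(pool) + 1) // 2
    groups = [pool[:half], pool[half:]]
    in_pool = set(pool)
    # Sensors outside the pool, most likely normal first.
    candidates = sorted(
        (i for i in range(len(p_normal)) if i not in in_pool),
        key=lambda i: p_normal[i],
        reverse=True,
    )
    for group in groups:
        while len(group) < min_size and candidates:
            group.append(candidates.pop(0))
    return groups
```

Padding with sensors that are very likely normal keeps the subgroup sizes uniform without changing,
with high probability, whether the pool contains a faulty sensor.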

The difference in performance with and without the added sensor distribution step is illustrated
in the following experiment. Fig. 21 shows the detection rate and false alarm for non-linearity faults
under BGT and KF-BGT. The performance is evaluated under both order 162 and order 90 system
models. The performance of BGT declines significantly under a less accurate system model (smaller
model order). In contrast, the performance of KF-BGT deteriorates only slightly, and thus it improves
upon BGT significantly when the system model is less accurate. This shows that the sensor distribution
step makes the resulting method highly robust against the quality of the system model.


Figure 21: The performance of BGT and KF-BGT under different model orders. Horizontal axis: number
of tests; curves: detection rate (DR) and false alarm rate (FA) for BGT and KF-BGT with model orders
162 and 90.


9 Conclusion 

This study introduced group-testing-based sensor fault detection methods for sensor networks when 
faulty sensors are rare. We proposed a Kalman filter based group test method which is able to 
evaluate a random group of sensors and determine whether any faulty sensor is contained in the 
group. We presented a combinatorial group testing (CGT) method and a Bayesian group testing 
(BGT) method, which is adaptive and particularly suitable when group tests are noisy. We also show 
how the computation involved in BGT can be made tractable for large networks. 

Both CGT and BGT are evaluated by using a set of real vibration data collected by sensors 
deployed on the New Carquinez Bridge. Results show that both methods are able to reduce the 
number of required tests when faulty sensors are rare, compared to non-group testing methods. In 
addition, BGT uses fewer tests than CGT as it exploits previous test results in a sequential setting. 
We further propose an enhancement to BGT (KF-BGT) by taking into account features of the Kalman 
filtering process when determining subgroups in a single group test.